
The Biotechnology Systems Branch of the Scientific and Technical Information
Center (STIC) detected errors when processing the following computer readable
form:

Application Serial Number: 09/993,179

Source: O ( 

Date Processed by STIC: 12/05/2001



THE ATTACHED PRINTOUT EXPLAINS DETECTED ERRORS 

PLEASE FORWARD THIS INFORMATION TO THE APPLICANT BY EITHER:

2) TELEPHONING APPLICANT AND FAXING A COPY OF THIS PRINTOUT, WITH A NOTICE TO COMPLY
FOR CRF SUBMISSION QUESTIONS, PLEASE CONTACT MARK SPENCER, 703-308-4212.

FOR SEQUENCE RULES INTERPRETATION, PLEASE CONTACT ROBERT WAX, 703-308-4216
PATENTIN 2.1 e-mail help: patin21help@uspto.gov or phone 703-306-4119 (R. Wax)
PATENTIN 3.0 e-mail help: patin3help@uspto.gov or phone 703-306-4119 (R. Wax)

TO REDUCE ERRORED SEQUENCE LISTINGS, PLEASE USE THE CHECKER
VERSION 3.0 PROGRAM, ACCESSIBLE THROUGH THE U.S. PATENT AND
TRADEMARK OFFICE WEBSITE. SEE BELOW



Checker Version 3.0

The Checker Version 3.0 application is a state-of-the-art Windows-based software program
employing a logical and intuitive user interface to check whether a sequence listing is in
compliance with format and content rules. Checker Version 3.0 works for sequence listings
generated for the original version of 37 CFR §§1.821 - 1.825 effective October 1, 1990 (old
rules) and the revised version (new rules) effective July 1, 1998, as well as World Intellectual
Property Organization (WIPO) Standard ST.25.

Checker Version 3.0 replaces the previous DOS-based version of Checker and is Y2K-compliant.
Checker allows public users to check sequence listings in Computer Readable Form
(CRF) before submitting them to the United States Patent and Trademark Office (USPTO).
Use of Checker prior to filing the sequence listing is expected to result in fewer errored sequence
listings, thus saving time and money.



Sequence Listing Error Summary

ERROR DETECTED / SUGGESTED CORRECTION

SERIAL NUMBER: 09/993,179

ATTN: NEW RULES CASES: PLEASE DISREGARD ENGLISH "ALPHA" HEADERS, WHICH WERE INSERTED BY PTO SOFTWARE

Wrapped Nucleics / Wrapped Aminos
The number/text at the end of each line "wrapped" down to the next line. This may occur if your file
was retrieved in a word processor after creating it. Please adjust your right margin to .3.

Invalid Line Length
The rules require that a line not exceed 72 characters in length. This includes white spaces.

Misaligned Amino Numbering
The numbering under each 5th amino acid is misaligned. Do not use tab codes;
use space characters, instead.

The submitted file was not saved in ASCII (DOS) text, as required by the Sequence Rules. Please
ensure your subsequent submission is saved in ASCII text.

Sequence(s) contain n's or Xaa's representing more than one residue. Per Sequence Rules,
each n or Xaa can only represent a single residue. Please present the maximum number of each
residue having variable length and indicate in the <220>-<223> section that some may be missing.

PatentIn 2.0 "bug"
A "bug" in PatentIn version 2.0 has caused the <220>-<223> section to be missing from amino acid
sequence(s). Normally, PatentIn would automatically generate this section from the
previously coded nucleic acid sequence. Please manually copy the relevant <220>-<223> section to
the subsequent amino acid sequence. This applies to the mandatory <220>-<223> sections for
Artificial or Unknown sequences.

Skipped Sequences (OLD RULES)
Sequence(s) missing. If intentional, please insert the following lines for each skipped sequence:

(2) INFORMATION FOR SEQ ID NO:X: (insert SEQ ID NO where "X" is shown)
(i) SEQUENCE CHARACTERISTICS: (Do not insert any subheadings under this heading)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:X: (insert SEQ ID NO where "X" is shown)
This sequence is intentionally skipped

Please also adjust the "(ii) NUMBER OF SEQUENCES:" response to include the skipped sequences.

Skipped Sequences (NEW RULES)
Sequence(s) missing. If intentional, please insert the following lines for each skipped sequence:

<210> sequence id number
<400> sequence id number

Use of n's or Xaa's (NEW RULES)
Use of n's and/or Xaa's has been detected in the Sequence Listing.
Per 1.823 of Sequence Rules, use of <220>-<223> is MANDATORY if n's or Xaa's are present.
In <220> to <223> section, please explain location of n or Xaa, and which residue n or Xaa represents.

Invalid <213> Response
Per 1.823 of Sequence Rules, the only valid <213> responses are: Unknown, Artificial Sequence, or
scientific name (Genus/species). <220>-<223> section is required when <213> response is Unknown or
is Artificial Sequence.

Sequence(s) missing the <220> "Feature" and associated numeric identifiers and responses.
Use of <220> to <223> is MANDATORY if <213> "Organism" response is "Artificial Sequence" or
"Unknown." Please explain source of genetic material in <220> to <223> section.
(See "Federal Register," 06/01/1998, Vol. 63, No. 104, pp. 29631-32) (Sec. 1.823 of Sequence Rules)

PatentIn 2.0 "bug"
Please do not use "Copy to Disk" function of PatentIn version 2.0. This causes a corrupted file,
resulting in missing mandatory numeric identifiers and responses (as indicated on raw sequence
listing). Instead, please use "File Manager" or any other manual means to copy file to floppy disk.

n can only be used to represent a single nucleotide in a nucleic acid sequence. N is not used to represent
any value not specifically a nucleotide.
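The plain-text rules in the summary above (a 72-character line limit that counts white space, space characters instead of tab codes, and ASCII text) can be pre-checked mechanically before a file is submitted. The following is a minimal Python sketch under those stated rules only; the function name `check_listing_lines` and the wording of its messages are illustrative assumptions and are not part of Checker or PatentIn.

```python
from typing import List, Tuple

MAX_LINE_LENGTH = 72  # limit quoted in the error summary, white space included


def check_listing_lines(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, problem) pairs for the plain-text rules above.

    Hypothetical helper: it only looks at line length, tab codes and
    non-ASCII characters, not at the content of the sequence listing.
    """
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if len(line) > MAX_LINE_LENGTH:
            problems.append((number, "line exceeds 72 characters"))
        if "\t" in line:
            problems.append((number, "tab code found; use spaces"))
        if any(ord(ch) > 127 for ch in line):
            problems.append((number, "non-ASCII character found"))
    return problems
```

A file that passes this sketch may still fail Checker; it covers only the three formatting rules named here.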
RAW SEQUENCE LISTING

PATENT APPLICATION: US/09/993,179
DATE: 12/05/2001
TIME: 09:54:30

Input Set : A:\sequence listing.txt
Output Set: N:\CRF3\11212001\I993179.raw

Does Not Comply
Corrected Diskette Needed



3 <110> APPLICANT: McCarthy, Sean A.
4 Kuranda, Michael Joseph
5 Bulawa, Christine Ellen
6 Bossone, Steven
8 <120> TITLE OF INVENTION: METHOD FOR IDENTIFYING GENES ENCODING SIGNAL SEQUENCES
10 <130> FILE REFERENCE: 09404/032001
12 <140> CURRENT APPLICATION NUMBER: US/09/993,179
13 <141> CURRENT FILING DATE: 2001-11-06
15 <160> NUMBER OF SEQ ID NOS: 15
17 <170> SOFTWARE: FastSEQ for Windows Version 3.0



ERRORED SEQUENCES 



52 <210> SEQ ID NO: 2 

53 <211> LENGTH: 50 

54 <212> TYPE: PRT 

55 <213> ORGANISM: Homo sapiens 

57 <400> SEQUENCE: 2 

58 Met Lys Gly Thr Cys Val Ile Ala Trp Leu Phe Ser Ser Leu Gly Leu
E--> 59 1

60 Trp Arg Leu Ala His Pro Glu Ala Gln Gly Thr Thr Gln Cys Gln Arg
E--> 61 20 25

62 Thr Leu Glu Val Asn Ile Val Ser Pro Ser Ser Lys Ala Thr Phe Ser
E--> 63 35 40 45

64 Pro Ser
65 50
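The numbering lines flagged with E--> above relate to the rule that numbers sit under each 5th amino acid and are positioned with space characters rather than tab codes. One possible way to generate such lines is sketched below in Python. The 16-residues-per-line width mirrors this printout and is an assumption, as are the left-aligned placement of each number and the function name `format_protein`; this is not the layout algorithm used by PatentIn or FastSEQ.

```python
from typing import List

RESIDUES_PER_LINE = 16  # matches the printout above; an assumption, not a rule
COLUMN_WIDTH = 4        # three-letter code plus one separating space


def format_protein(residues: List[str]) -> List[str]:
    """Return alternating residue and numbering lines, padded with spaces only."""
    lines = []
    for start in range(0, len(residues), RESIDUES_PER_LINE):
        chunk = residues[start:start + RESIDUES_PER_LINE]
        numbers = ""
        for offset in range(len(chunk)):
            position = start + offset + 1
            # Number the first residue and every 5th residue after it.
            if position == 1 or position % 5 == 0:
                # Pad with spaces up to the residue's column, then write the number.
                numbers = numbers.ljust(offset * COLUMN_WIDTH) + str(position)
        lines.append(" ".join(chunk))
        lines.append(numbers)
    return lines
```

For example, `format_protein("Met Lys Gly Thr Cys Val Ile".split())` yields a residue line followed by a numbering line in which `5` starts in the same column as `Cys`.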



112 <210> SEQ ID NO: 4
113 <211> LENGTH: 125
114 <212> TYPE: PRT
115 <213> ORGANISM: Homo sapiens
117 <400> SEQUENCE: 4
118 Met Arg Ser Leu Leu Arg Thr Pro Phe Leu Cys Gly Leu Leu Trp Ala
E--> 119 1 5 10 15
120 Phe Cys Ala Pro Gly Ala Arg Ala Glu Glu Pro Ala Ala Ser Phe Ser
E--> 121 20 25 30
122 Gln Pro Gly Ser Met Gly Leu Asp Lys Asn Thr Val His Asp Gln Glu
E--> 123 35 40 45
124 His Ile Met Glu His Leu Glu Gly Val Ile Asn Lys Pro Glu Ala Glu
E--> 125 50 55 60
126 Met Ser Pro Gln Glu Leu Gln Leu His Tyr Phe Lys Met His Asp Tyr
E--> 127 65 70 75 80
128 Asp Gly Asn Asn Leu Leu Asp Gly Leu Glu Leu Ser Thr Ala Ile Thr
E--> 129 85 90 95
130 His Val His Lys Glu Glu Gly Ser Glu Gln Ala Pro Leu Glu Val Asn
E--> 131 100 105 110
132 Ile Val Ser Pro Ser Ser Lys Ala Thr Phe Ser Pro Ser

--> 133 115 120 125

135 <210> SEQ ID NO: 5 

136 <211> LENGTH: 32 

137 <212> TYPE: PRT 

138 <213> ORGANISM: Mus musculus 

140 <400> SEQUENCE: 5 

141 Met Lys Gly Ala Cys Ile Leu Ala Trp Leu Phe Ser Ser Leu Gly Val
--> 142 1 5 10 15
143 Trp Arg Leu Ala Arg Pro Glu Thr Gln Asp Pro Ala Lys Cys Gln Arg
--> 144 20 25 30

146 <210> SEQ ID NO: 6 

147 <211> LENGTH: 45 

148 <212> TYPE: PRT 

149 <213> ORGANISM: Homo sapiens 

151 <400> SEQUENCE: 6 

152 Met Ser Pro Gln Glu Leu Gln Leu His Tyr Phe Lys Met His Asp Tyr
--> 153 1 5 10 15
154 Asp Gly Asn Asn Leu Leu Asp Gly Leu Glu Leu Ser Thr Ala Ile Thr
--> 155 20 25 30
156 His Val His Lys Glu Glu Gly Ser Glu Gln Ala Pro Leu
--> 157 35 40 45

238 <210> SEQ ID NO: 14 

239 <211> LENGTH: 32 

240 <212> TYPE: PRT 

241 <213> ORGANISM: Homo sapiens 

243 <400> SEQUENCE: 14 

244 Met Lys Gly Thr Cys Val Ile Ala Trp Leu Phe Ser Ser Leu Gly Leu
--> 245 1 5 10 15
246 Trp Arg Leu Ala His Pro Glu Ala Gln Gly Thr Thr Gln Cys Gln Arg
--> 247 20 25 30

249 <210> SEQ ID NO: 15
250 <211> LENGTH: 108
251 <212> TYPE: PRT
252 <213> ORGANISM: Homo sapiens
254 <400> SEQUENCE: 15
255 Met Arg Ser Leu Leu Arg Thr Pro Phe Leu Cys Gly Leu Leu Trp Ala
E--> 256 1 5 10 15
257 Phe Cys Ala Pro Gly Ala Arg Ala Glu Glu Pro Ala Ala Ser Phe Ser
E--> 258 20 25 30
259 Gln Pro Gly Ser Met Gly Leu Asp Lys Asn Thr Val His Asp Gln Glu
E--> 260 35 40 45
261 His Ile Met Glu His Leu Glu Gly Val Ile Asn Lys Glu Ala Glu Met
E--> 262 50 55 60
263 Ser Pro Gln Glu Leu Gln Leu His Tyr Phe Lys Met His Asp Tyr Asp
E--> 264 65 70 75 80
265 Gly Asn Asn Leu Leu Asp Gly Leu Glu Leu Ser Thr Ala Ile Thr His
E--> 266 85 90 95
267 Val His Lys Glu Glu Gly Ser Glu Gln Ala Pro Leu
E--> 268 100 105
<400> 1

ggggaccgtg tttgtggccc ccaagccggt gccccccatt ttggaactca gcgagtaggg 60

ggcggctctg gggaagtggc agggggcgca gcagctgctg cctccacttc cctagccagg 120 

tgctgaagag gatcttcgga gccgctctgg cccccaggcg ctggatgact ggcaccagcg 180 

ctcctcgcac ctgtgttggt gtgtgagact tgggctggag tgcccacgtg gctgtggagt 240 

cagtgtgatt catgattgag gaaacgcgtc ctccatcctc tctctccttg gcactttcca 300 

cacatgagga gaagaagagc ttctgtttag aagacacgtg cccagagtca gaggcccctt 360 

gcccacc atg aag gga acc tgt gtt ata gca tgg ctg ttc tca agc ctg 409
Met Lys Gly Thr Cys Val Ile Ala Trp Leu Phe Ser Ser Leu
1 5 10

ggg ctg tgg aga ctc gcc cac cca gag gcc cag ggt acg act cag tgc 457
Gly Leu Trp Arg Leu Ala His Pro Glu Ala Gln Gly Thr Thr Gln Cys
15 20 25 30

cag aga aca ctc gag gtg aat att gtt tcc ccc agc tcc aag gca aca 505
Gln Arg Thr Leu Glu Val Asn Ile Val Ser Pro Ser Ser Lys Ala Thr
35 40 45



ttc agt cca agt 
Phe Ser Pro Ser 






Use of n and/or Xaa has been detected in the Sequence Listing.
Review the Sequence Listing to ensure a corresponding
explanation is presented in the <220> to <223> fields of
each sequence using n or Xaa.
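The review requested above can be approximated with a short script. The Python sketch below assumes an unnumbered listing file in which each sequence record opens with a <210> line and the residues follow <400>; it reports the SEQ ID numbers that use n or Xaa but carry no <220> and <223> fields. The function name `sequences_missing_feature` is hypothetical, and the test for n (a lowercase group made up only of a, c, g, t, u and n) is a simplification rather than the rule applied by Checker.

```python
import re
from typing import List

# A token counts as a nucleotide group if it consists only of these letters.
NUCLEOTIDE_GROUP = re.compile(r"^[acgtun]+$")


def sequences_missing_feature(listing: str) -> List[str]:
    """Return SEQ ID numbers that use n or Xaa but lack <220>/<223> fields."""
    flagged = []
    # Split in front of every line that starts a new <210> record.
    for record in re.split(r"(?m)^(?=[ \t]*<210>)", listing):
        id_match = re.search(r"<210>[^\d\n]*(\d+)", record)
        if id_match is None:
            continue  # general information before the first sequence
        head, _, body = record.partition("<400>")
        uses_xaa = re.search(r"\bXaa\b", body) is not None
        uses_n = any(
            NUCLEOTIDE_GROUP.match(token) and "n" in token
            for token in body.split()
        )
        has_feature = "<220>" in head and "<223>" in head
        if (uses_xaa or uses_n) and not has_feature:
            flagged.append(id_match.group(1))
    return flagged
```

The sketch only checks that the fields exist; whether their text actually explains the location and meaning of each n or Xaa still needs a manual read.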



VERIFICATION SUMMARY

PATENT APPLICATION: US/09/993,179 



DATE: 12/05/2001 
TIME: 09:54:31 



Input Set : A:\sequence listing.txt 
Output Set: N:\CRF3\11212001\I993179.raw 



L:12 M:270 C: Current Application Number differs, Replaced Current Application Number
L:13 M:271 C: Current Filing Date differs, Replaced Current Filing Date
L:41 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1
L:45 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1
L:49 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:1
L:59 M:332 E: (32) Invalid/Missing Amino Acid Numbering, SEQ ID:2
M:332 Repeated in SeqNo=2
L:81 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:3
L:85 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:3
L:89 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:3
L:93 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:3
L:97 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:3
L:101 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:3
L:105 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:3
L:109 M:336 W: Invalid Amino Acid Number in Coding Region, SEQ ID:3
L:119 M:332 E: (32) Invalid/Missing Amino Acid Numbering, SEQ ID:4
M:332 Repeated in SeqNo=4
L:142 M:332 E: (32) Invalid/Missing Amino Acid Numbering, SEQ ID:5
M:332 Repeated in SeqNo=5
L:153 M:332 E: (32) Invalid/Missing Amino Acid Numbering, SEQ ID:6
M:332 Repeated in SeqNo=6
L:209 M:257 W: Feature value mis-spelled or invalid, <221> Name/Key for SEQ ID#:11
L:214 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:11
L:245 M:332 E: (32) Invalid/Missing Amino Acid Numbering, SEQ ID:14
M:332 Repeated in SeqNo=14
L:256 M:332 E: (32) Invalid/Missing Amino Acid Numbering, SEQ ID:15
M:332 Repeated in SeqNo=15
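
Each entry in the verification summary above follows the pattern "L:<number> M:<number> <letter>: <text>", where the L value matches a line number in the raw sequence listing. A minimal Python sketch for reading such entries into records is given below. The names `SummaryEntry`, `parse_summary` and `count_by_kind` are hypothetical, and reading the M value as a message code is an assumption drawn from the printout, not from USPTO documentation.

```python
import re
from collections import Counter
from typing import List, NamedTuple


class SummaryEntry(NamedTuple):
    line: int   # value after "L:", a line number in the raw listing
    code: int   # value after "M:" (assumed here to be a message code)
    kind: str   # single letter printed before the text: C, W or E
    text: str


ENTRY = re.compile(r"^L:(\d+)\s+M:\s*(\d+)\s+([CWE]):\s*(.*)$")


def parse_summary(report: str) -> List[SummaryEntry]:
    """Parse "L:.. M:.. X: text" lines; other lines are skipped."""
    entries = []
    for raw in report.splitlines():
        match = ENTRY.match(raw.strip())
        if match:  # "M:332 Repeated in SeqNo=2" style lines do not match
            line, code, kind, text = match.groups()
            entries.append(SummaryEntry(int(line), int(code), kind, text))
    return entries


def count_by_kind(entries: List[SummaryEntry]) -> Counter:
    """Tally entries by their letter (C, W or E)."""
    return Counter(entry.kind for entry in entries)
```

Filtering the parsed entries for `kind == "E"` gives the listing lines that correspond to the E--> markers in the raw sequence listing.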